Finding Atmospheric Composition (AC) Metadata 



Searching the world for AC dataset level metadata 






The Atmospheric Composition Portal (ACP) is an aggregator and curator of information related to remotely sensed atmospheric composition data and
analysis. It uses existing tools and technologies and, where needed, enhances those capabilities to provide interoperable access, tools, and contextual
guidance for scientists and value-adding organizations using remotely sensed atmospheric composition data. The initial focus is on the Essential Climate
Variables identified by the Global Climate Observing System: CH4, CO, CO2, NO2, O3, SO2 and aerosols. This poster addresses our efforts in
building the ACP Data Table, an interface to help discover and understand remotely sensed data that are related to atmospheric composition science and
applications. We harvested the GCMD, CWIC, and GEOSS metadata catalogs using machine-to-machine technologies (OpenSearch, Web Services). We also
manually investigated the plethora of CEOS data providers' portals and other catalogs where that data might be aggregated. This poster presents our experience
of the excellence, variety, and challenges we encountered.

Conclusions: 

1. The significant benefit of the major catalogs is their machine-to-machine tools, such as OpenSearch and Web
Services, rather than GUI usability, which is limited by the large amount of data in their catalogs.

2. There is a trend at the large catalogs towards simulating small data provider portals through advanced services.

4. Populating metadata catalogs using ISO 19115 is too complex for data providers to do in a consistent way; the records are
difficult to parse visually or with XML libraries, and too complex for Java XML binders like CASTOR.

5. Searching for IDs first and then for data (GCMD and ECHO) works better for machine-to-machine operations
than returning the entire metadata entry at once, which led to timeouts.

6. Metadata harvest and export activities between the major catalogs have led to a significant amount of duplication.
(This is currently being addressed.)

7. Most (if not all) Earth science atmospheric composition data providers store a reference to their data at GCMD.

8. Our experience showed that dataset-level metadata search tools, catalogs, and portals are constantly improving:
some problems that we encountered when we started developing the ACP Data Table have been resolved by metadata
providers and metadata catalog providers.
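The ID-first pattern described in conclusion 5 can be sketched as follows. This is a minimal illustration, not the ACP implementation: `harvest_in_two_phases`, `fake_ids`, and `fake_record` are hypothetical names, and the two fetchers stand in for real catalog requests so the sketch runs without network access.

```python
from typing import Callable, Dict, Iterable, List


def harvest_in_two_phases(
    fetch_ids: Callable[[], Iterable[str]],
    fetch_record: Callable[[str], str],
) -> Dict[str, str]:
    """Fetch the list of entry IDs first, then each record on its own.

    Both callables are hypothetical stand-ins for catalog requests.
    A failure on one record is noted and skipped so that it does not
    abort the whole harvest.
    """
    records: Dict[str, str] = {}
    failed: List[str] = []
    for entry_id in fetch_ids():
        try:
            records[entry_id] = fetch_record(entry_id)
        except Exception:  # e.g. a timeout on a single large record
            failed.append(entry_id)
    if failed:
        print(f"skipped {len(failed)} record(s): {failed}")
    return records


# Stand-in fetchers (assumptions) so the sketch is runnable offline.
def fake_ids() -> List[str]:
    return ["ENTRY_A", "ENTRY_B", "ENTRY_C"]


def fake_record(entry_id: str) -> str:
    if entry_id == "ENTRY_B":
        raise TimeoutError("simulated timeout")
    return f"<DIF><Entry_ID>{entry_id}</Entry_ID></DIF>"


harvested = harvest_in_two_phases(fake_ids, fake_record)
```

Because each record is requested separately, a slow or oversized entry costs one request rather than the whole result set.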
ACP Data Table

Filters shown in the interface:

Parameter: CH4, CO, CO2, NO2, O3, SO2

Service: FTP, HTTP, OpeNDAP, WCS, WMS

Temporal Resolution: All, 6 minute, Twice per day (daytime and nighttime), Daily, Once per 8 days (daytime and nighttime), Monthly

Spatial Resolution: All, 0.25 x 0.25 deg, 1 x 1 deg, 1.25 x 1 deg, 1.25 x 1.25 deg, 50 km x 50 km, 110 km x 110 km

Search:


Advantages 


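Table columns: Dataset Name; Data Provider; CH4; CO; CO2; NO2; O3; SO2; Temporal Resolution; Spatial Resolution; FTP; HTTP; OpeNDAP; WCS; WMS; Notes; Metadata Source

[Screenshot of the first five rows of the ACP Data Table. The check-mark positions in the parameter and service columns are not recoverable.]

Dataset Name: Atmospheric CO2 from flask air samples at 10 sites in the SIO air sampling network, from CDIAC/Trends
Data Provider: DOE/ORNL/ESD/CDIAC
Temporal Resolution: monthly, weekly
Notes / Metadata Source: Automated GESDISC retrieval of all GCMD info

Dataset Name: 11,000 Year Sunspot Number Reconstruction
Data Provider: WDC/PALEOCLIMATOLOGY
Temporal Resolution: decadal
Notes / Metadata Source: Automated GESDISC retrieval of all GCMD info

Dataset Name: 2000 Pilot Environmental Sustainability Index (ESI)
Notes / Metadata Source: ECHO, Automated GESDISC retrieval of all GCMD info

Dataset Name: 2001 Environmental Sustainability Index (ESI)
Data Provider: SEDAC
Notes / Metadata Source: ECHO, Automated GESDISC retrieval of all GCMD info

Dataset Name: 2002 Environmental Sustainability Index (ESI)
Data Provider: SEDAC
Notes / Metadata Source: ECHO, Automated GESDISC retrieval of all GCMD info

Showing 1 to 5 of 2,162 entries (filtered from 2,164 total entries)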
GCMD - CMR/ECHO
(Global Change Master Directory)


CWIC/GCMD
(CEOS WGISS Integrated Catalog)


FEDEO 


Multiple Formats and Algorithms: Multiple sites, different response formats, etc. in each data provider's portal are difficult to deal with.

Complex XML Format: ISO 19115 cannot be parsed by XML binding engines like CASTOR, so parsing is left to XML libraries. Data providers interpret these complex XML formats differently and put the same data in different parts of the schema.

Duplication: Large catalogs harvest and populate each other's sites, leading to identical data products listed several times.
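One way to cope with the duplication described above is to collapse harvested records onto a normalized key. The sketch below assumes that a normalized title is a good enough key, which may not hold in practice (a shared identifier would be more reliable where one exists). The record layout (`title`, `source`) and the function names are hypothetical.

```python
import re
from typing import Dict, Iterable, List


def normalize_title(title: str) -> str:
    """Lower-case a title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def merge_duplicates(records: Iterable[Dict[str, str]]) -> List[dict]:
    """Keep one entry per normalized title and remember every source catalog."""
    merged: Dict[str, dict] = {}
    for rec in records:
        key = normalize_title(rec["title"])
        entry = merged.setdefault(key, {"title": rec["title"], "sources": []})
        if rec["source"] not in entry["sources"]:
            entry["sources"].append(rec["source"])
    return list(merged.values())


# Illustrative input: the same product listed by two catalogs.
records = [
    {"title": "2001 Environmental Sustainability Index (ESI)", "source": "GCMD"},
    {"title": "2001 Environmental  Sustainability Index (ESI) ", "source": "ECHO"},
    {"title": "11,000 Year Sunspot Number Reconstruction", "source": "GCMD"},
]
unique = merge_duplicates(records)
```

Keeping the list of sources, rather than discarding the duplicates outright, preserves the information about which catalogs carry each product.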


http://gcmdservices.gsfc.nasa.gov/mws/entryids/dif?query=[Data_Center:%20Short_Name=%27*%27]&format=XML
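The GCMD entry-ID request above can be assembled programmatically. This is a minimal sketch using only the standard library: the endpoint and query syntax are copied from the poster and may no longer be live, and `entry_id_query_url` is a hypothetical helper name.

```python
from urllib.parse import quote

# Endpoint as printed on the poster (assumed, not verified to be live).
GCMD_ENTRY_IDS = "http://gcmdservices.gsfc.nasa.gov/mws/entryids/dif"


def entry_id_query_url(data_center_short_name: str = "*") -> str:
    """Build the entry-ID query for one data center ('*' matches all)."""
    query = f"[Data_Center: Short_Name='{data_center_short_name}']"
    # Keep the bracket/colon/equals/star characters literal, as in the
    # poster's URL; spaces and quotes become %20 and %27.
    encoded = quote(query, safe="[]:=*")
    return f"{GCMD_ENTRY_IDS}?query={encoded}&format=XML"


url = entry_id_query_url()
print(url)
```

Passing a specific short name instead of `*` would narrow the ID list to one data center, under the assumption that the service accepts the same query syntax for named centers.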


http://gcmd.gsfc.nasa.gov/KeywordSearch/OpenSearch.do?searchTerms=ozone&output=atom&count=1000&startIndex=1&startPage=1&Portal=cwic
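A request with `output=atom` returns an Atom feed, which a plain XML library can read. The sketch below parses a small hand-written feed rather than a real response: `SAMPLE_FEED` and `parse_feed` are illustrative, and a real catalog response will carry more elements than shown here.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "os": "http://a9.com/-/spec/opensearch/1.1/",
}

# Hand-written stand-in for a catalog response (assumption).
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:os="http://a9.com/-/spec/opensearch/1.1/">
  <os:totalResults>2</os:totalResults>
  <entry><id>example-1</id><title>Example ozone dataset A</title></entry>
  <entry><id>example-2</id><title>Example ozone dataset B</title></entry>
</feed>
"""


def parse_feed(xml_text: str) -> Tuple[Optional[int], List[Tuple[str, str]]]:
    """Return (total result count, [(entry id, entry title), ...])."""
    root = ET.fromstring(xml_text)
    total = root.findtext("os:totalResults", namespaces=NS)
    entries = [
        (
            entry.findtext("atom:id", default="", namespaces=NS),
            entry.findtext("atom:title", default="", namespaces=NS),
        )
        for entry in root.findall("atom:entry", NS)
    ]
    return (int(total) if total else None, entries)


total_results, entries = parse_feed(SAMPLE_FEED)
```

Comparing `total_results` with the number of entries received is one way to decide whether another page needs to be requested.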


(Federated Earth Observation missions access; ESA gateway to certain data)


http://fedeo.esa.int/opensearch/request/?httpAccept=application/atom%2Bxml&recordSchema=iso&startRecord=1&maximumRecords=2000&query=ozone
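The FEDEO request above asks for `recordSchema=iso`, so the returned records follow the ISO 19115 XML encoding. Since binding engines were not an option, a plain XML library has to locate each field by path. The sketch below pulls out a dataset title; `SAMPLE_ISO` is a hand-written fragment and `dataset_title` is a hypothetical helper. Real records are far larger, and providers may place the same information elsewhere in the schema.

```python
import xml.etree.ElementTree as ET
from typing import Optional

ISO_NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Hand-written stand-in for one ISO metadata record (assumption).
SAMPLE_ISO = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation>
        <gmd:CI_Citation>
          <gmd:title>
            <gco:CharacterString>Example ozone product</gco:CharacterString>
          </gmd:title>
        </gmd:CI_Citation>
      </gmd:citation>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>
"""


def dataset_title(xml_text: str) -> Optional[str]:
    """Return the citation title of an ISO record, or None if absent."""
    root = ET.fromstring(xml_text)
    # Search at any depth, since records can be wrapped differently.
    node = root.find(".//gmd:CI_Citation/gmd:title/gco:CharacterString", ISO_NS)
    if node is None or node.text is None:
        return None
    return node.text.strip()


title = dataset_title(SAMPLE_ISO)
```

Returning `None` instead of raising lets a harvester record which providers omit or relocate a field without stopping the run.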



GEOSS
(Global Earth Observation System of Systems)


http://production.geodab.eu/gi-cat-StP/services/opensearch?&ct=100&st=ozone&ts=2002-01-01T00:00:00Z&te=2010-01-01T00:00:00Z
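The GEOSS query above combines a search term with a time window. A small sketch of building such a URL from `datetime` values follows. The parameter meanings (`ct` as result count, `st` as search term, `ts`/`te` as start and end time) are inferred from the values on the poster and are assumptions, as is the helper name `geoss_search_url`. Naive datetimes are treated as UTC.

```python
from datetime import datetime
from urllib.parse import urlencode

# Endpoint as printed on the poster (assumed, not verified to be live).
GEOSS_OPENSEARCH = "http://production.geodab.eu/gi-cat-StP/services/opensearch"

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def geoss_search_url(term: str, start: datetime, end: datetime,
                     count: int = 100) -> str:
    """Build a search URL for `term` restricted to [start, end]."""
    if end < start:
        raise ValueError("end must not precede start")
    params = {
        "ct": count,
        "st": term,
        "ts": start.strftime(TIME_FORMAT),
        "te": end.strftime(TIME_FORMAT),
    }
    # Leave ':' unescaped so timestamps match the form used on the poster.
    return f"{GEOSS_OPENSEARCH}?{urlencode(params, safe=':')}"


geoss_url = geoss_search_url("ozone", datetime(2002, 1, 1), datetime(2010, 1, 1))
print(geoss_url)
```

Generating the timestamps from `datetime` objects avoids hand-typed date strings, which are easy to get subtly wrong.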




ACADIS, ANZ, ARL, BODC, CCHDO, CDIAC, CNDP, EFI, EPA, ESA, ESPO (NASA), GA Tech, GEIA, GESDISC...

...JAMSTEC, JP NIPR, KOPRI, LARC, NCDC, NSIDC, ORNL, Palmer Station, SEDAC, SOLAS, UCAR, U of Miami, UNEP, USDA, USGS, US GLOBEC, US JGOFS, UTM
A pleasant, focused user experience is easy because it

metadata record so